(54) Method and system for predicting pharmacokinetic properties



(57) This invention provides a method for predicting pharmacokinetic properties of molecules comprising the steps of:

(a) preparing 2D-structures of molecules used as a training set;

(b) constructing a 2D-fingerprint by counting the number of structural descriptors that potentially relate to a pharmacokinetic property, either manually or automatically using an internally developed macro; wherein said structural descriptors consist of 20 to 80 predefined atoms/fragments or substructures;

(c) analyzing the obtained 2D-fingerprint by a statistical analysis method to correlate with the pharmacokinetic property of the molecule to yield a quantitative structure-property relationship (QSPR) model; and

(d) calculating the pharmacokinetic property of a trial molecule using the QSPR model obtained above.

A system for this invention is also provided. According to this method and system, it is possible to predict pharmacokinetic properties of molecules prior to synthesis, without labor-intensive and time-consuming experimentation.
Description
Technical Field 

[0001] This invention relates to a method and system to predict pharmacokinetic (ADME) properties such as drug absorption (permeability), distribution, metabolism, and excretion, which are crucial properties in drug discovery.

Background Art 

[0002] Experimental measurements to obtain pharmacokinetic properties are time-consuming and labor-intensive. Moreover, experiments require a significant amount of the actual compounds. Thus, computational methods to predict such properties of virtual compounds are highly desirable for prioritizing targets prior to synthesis.
[0003] So far, descriptors similar to those conventionally employed in quantitative structure-activity relationship (QSAR) analysis (steric bulk, lipophilicity, HOMO energy, etc.) have been adopted in quantitative structure-property relationship (QSPR) analysis to correlate with PK-related parameters (t1/2, clearance, oxidation rate, etc.) (Lien, E. J. et al. Acta Pharm. Jugosl. 1984, 34, 123-131; Bäärnhielm, C. et al. Chem.-Biol. Interact. 1986, 58, 277-288). Graph-theory-derived parameters (molecular connectivity indexes, etc.) have also been used for this purpose (Markin, R. S. et al. Pharm. Res. 1988, 5, 201-208; Garcia-March, F. J. et al. J. Pharm. Pharmacol. 1995, 47, 232-236). Recently reported QSPR methods necessitate calculations on 3D-structures, which are still computationally intensive (Lombardo, F. et al. J. Med. Chem. 1996, 39, 4750-4755; Palm, K. et al. J. Med. Chem. 1998, 41, 5382-5392; Clark, D. E. J. Pharm. Sci. 1999, 88, 815-821). The QSPR methods also necessitate a complete set of molecular parameters (van de Waterbeemd, H. et al. Quant. Struct.-Act. Relat. 1996, 15, 480-490) that require experimental measurements to be determined.

[0004] 2D-fingerprints are frequently employed in molecular similarity/diversity analysis (e.g. ISIS™/Base similarity search or SYBYL™/Selector), high-volume QSAR analysis (e.g. HQSAR, vide infra), and other drug discovery settings. To date there has been no report on the development of 2D-fingerprint descriptors to analyze pharmacokinetic properties.
[0005] WO 98/07107 discloses MOLECULAR HOLOGRAM QSAR (HQSAR™) for developing high-volume QSAR models. HQSAR™ uses a molecular hologram based on fragment counts and deals mostly with potency/activity. A symposium proceeding (Niwa, T. "Prediction of Human Intestinal Absorption of Drug Based on Neural Network Modeling"; 27th Symposium on Structure-Activity Relationships held in Japan, Nov. 10, 1999) describes a method to estimate human intestinal absorption (HIA) based on molecular topological indexes derived from 2D-structure.
[0006] It would be highly desirable to provide a system and method to predict pharmacokinetic properties of actual and virtual molecules with high performance (predictivity and speed) and wide applicability to diverse molecules.

Brief Disclosure of the Invention

[0007] This invention provides a new method and system for QSPR analysis and prediction based only on 2D-structure, which allows properties to be predicted rapidly for hundreds of compounds. The method and system of this invention employ 2D-fingerprints, an array of the counts of functional groups, as descriptors for QSPR.
[0008] This invention provides a method for predicting pharmacokinetic properties of molecules comprising the steps of:

(a) preparing 2D-structures of molecules used as a training set; 

(b) constructing a 2D-fingerprint by counting the number of structural descriptors that potentially relate to a pharmacokinetic property, either manually or automatically using an internally developed macro; wherein said structural descriptors consist of 20 to 80 predefined atoms/fragments or substructures;

(c) analyzing the obtained 2D-fingerprint by a statistical analysis method to correlate with the pharmacokinetic property of the molecule to yield a quantitative structure-property relationship (QSPR) model; and

(d) calculating the pharmacokinetic property of a trial molecule using the QSPR model obtained above.

[0009] This invention also provides a system for predicting pharmacokinetic properties of molecules comprising: 

(a) means for preparing 2D-structures of molecules used as a training set; 

(b) means for constructing a 2D-fingerprint by counting the number of structural descriptors that potentially relate to a pharmacokinetic property, wherein said structural descriptors consist of 20 to 80 predefined atoms/fragments or substructures;

(c) means for analyzing the obtained 2D-fingerprint by a statistical analysis method to correlate with the pharmacokinetic property of the molecule to yield a quantitative structure-property relationship (QSPR) model; and

(d) means for calculating the pharmacokinetic property of a trial molecule using the QSPR model obtained above.

[0010] Another aspect of this invention provides a method wherein the pharmacokinetic property is absorption.
[0011] Another aspect of this invention provides a method wherein the pharmacokinetic property is distribution.
[0012] Another aspect of this invention provides a method wherein the pharmacokinetic property is metabolism.
[0013] Another aspect of this invention provides a method wherein the pharmacokinetic property is excretion.
[0014] Another aspect of this invention provides a method wherein the internally developed macro comprises the macro script 2dfp.spl or 2dfp_abs.spl, written in SYBYL™ Programming Language (SPL).

[0015] Preferably, each of the steps of the methods of the invention is carried out using molecular modeling software, databases or drawing software. More preferably, the modeling software is SYBYL™, version 6.5 (Tripos Inc., St. Louis, MO). The database includes, for example, ISIS™/Base version 2.2.1 (MDL Information Systems, Inc., San Leandro, CA). The drawing software includes, for example, the SYBYL™/SKETCH option, ISIS™/Draw version 2.2.1, ChemDraw Pro™ version 5.0 (CambridgeSoft Corp., Cambridge, MA) and SMILES™ (Daylight Chemical Information Systems, Inc., Mission Viejo, CA). Other modeling software, databases, and drawing software known to those of skill in the art can also be used.
[0016] This invention enables virtual screening of synthetic targets and data mining using databases, as well as drug design to optimize pharmacokinetic profiles. Based on the QSPR model of this invention, it is possible to predict pharmacokinetic properties of molecules prior to synthesis, without labor-intensive and time-consuming experiments. This invention relies on 2D-fingerprint modeling requiring only 2D-structure, which enables rapid prediction for hundreds of compounds without tedious calculation on 3D-structures. Moreover, the 2D-fingerprint used in this invention comprises only 20-80 bits.



Description of Figures

[0017]

Figure 1 is a flowchart showing the overall process of the invention.
Figure 2 shows a plot of actual vs. calculated log t1/2.
Figure 3 shows a plot of actual vs. calculated log(Papp × 10^6).
Figure 4 shows a plot of actual vs. calculated logBB.

Detailed Disclosure of the Invention 

[0018] The term "molecules used as a training set" as used herein refers to molecules whose pharmacokinetic properties have already been determined experimentally and which are used to develop a predictive QSPR model.
[0019] The term "pharmacokinetic properties" as used herein refers to the properties of molecules related to absorption (permeability), distribution, metabolism, and excretion (ADME).
[0020] A number of experimental methods or models are known in ADME. 

[0021] Examples of absorption studies are 1) kinetic studies based on measuring plasma concentration, urinary and fecal excretion, and gastrointestinal disposition after oral administration in vivo, 2) the single-pass perfusion method, recirculation method, and loop method in situ, and 3) the everted sac method and methods using brush border membrane vesicles, isolated cells, and cultured cells (Caco-2) in vitro, and the like.

[0022] Examples of distribution studies are 1) methods of measuring the concentration in target organs after administration by various techniques such as HPLC, LC-MS, autoradiography and microdialysis in vivo, 2) brain perfusion methods such as the vascular reference method (brain uptake index) in situ, and 3) methods using isolated cells or cultured cells (such as endothelial cells) in vitro, and the like.

[0023] Examples of metabolism studies are 1) kinetic studies based on measuring concentrations of drugs and their metabolites after adequate administration routes, such as intravenous administration or administration via the portal vein, in vivo and in situ, 2) kinetic studies such as the half-life of drugs in mammalian organs (liver, kidney, intestine, etc., as slices, homogenates, microsomes, etc.) and in isolated cells or cultured cells such as hepatocytes in vitro, and the like.

[0024] Examples of excretion studies are 1) kinetic studies based on measuring the concentration of drugs in urine, bile, feces, etc. after administration in vivo, 2) enzymatic studies of excretion via pumps such as P-glycoprotein in vitro, and the like.

[0025] The term "2D-fingerprint" as used herein, refers to a 2D-molecular measure in which a bit in a data string is 
set corresponding to atoms/fragments or substructures. 
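
As a minimal illustration of such a count-based fingerprint, the Python sketch below tallies a handful of descriptors from a toy molecule. The dictionary layout, the `count_fingerprint` helper and the seven descriptors chosen are assumptions made for this illustration only; the macro of this invention performs SLN substructure searches inside SYBYL™ instead.

```python
from collections import Counter

# Hypothetical toy representation (an assumption for this sketch):
# atoms as element symbols, bonds as (atom_index, atom_index, order),
# where order is 1, 2, 3 or "ar" for aromatic.
ACETONITRILE = {
    "atoms": ["C", "C", "N", "H", "H", "H"],
    "bonds": [(0, 1, 1), (1, 2, 3), (0, 3, 1), (0, 4, 1), (0, 5, 1)],
}

HALOGENS = ("F", "Cl", "Br", "I")


def count_fingerprint(mol):
    """Return a list of integer counts, one per predefined descriptor."""
    elements = Counter(mol["atoms"])
    orders = Counter(order for _, _, order in mol["bonds"])
    return [
        orders["ar"],                         # aromatic bonds
        orders[2],                            # double bonds
        orders[3],                            # triple bonds
        sum(elements[x] for x in HALOGENS),   # total halogens
        elements["N"],                        # total N
        elements["O"],                        # total O
        elements["S"],                        # total S
    ]


fingerprint = count_fingerprint(ACETONITRILE)
print(fingerprint)
```

Each position holds a count rather than a simple on/off flag, which is what lets a regression method weight, for example, two N-methyl groups differently from one.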
[0026] The term "predefined atoms/fragments or substructures" as used herein refers to atoms or functional groups relating to a pharmacokinetic property, which are based on the literature sources (Bonse, V. G., Metzler, M. "Biotransformationen Organischer Fremdsubstanzen" (Yakubutu-Taisha) in Japanese, Asakura, Tokyo (1980); Kato, R., Kamataki, T. "Yakubutu-Taishagaku" in Japanese, Tokyo-Kagaku-Dojin, Tokyo, chapter 4, 93-123 (1995)), or otherwise refers to functional groups such as saturated or unsaturated bonds, rings (aromatic or cycloalkyl), amines, anilines, nitrogen in aromatics, imines/nitriles/guanidine/amidine, oxyamine (N-O)/nitro/azo/hydrazine, amide/thioamide/sulfonamide, alcohol/ether/aldehyde/ketone/ester/carboxylic acid/sulfonic acid, halogen, oxygen or sulfur functional groups, and the total number of carbon, hydrogen, nitrogen, oxygen, sulfur or phosphorus atoms.

[0027] The term "internally developed macro" as used herein refers to internally developed SYBYL™ Programming Language (SPL) code. A preferable internally developed macro is as described in Working Examples 4 and 5.
[0028] The QSPR model based on 2D-fingerprints for metabolism predicts the half-life of molecules in a human liver microsome mixture with good predictivity. The 2D-fingerprints for absorption are successfully employed to develop a highly predictive QSPR model of drug permeability across Caco-2 cell monolayers. Similarly, the present 2D-fingerprint/PLS modeling can be applied to develop statistically significant QSPR models of blood-brain barrier partitioning for a structurally diverse set. Thus, the method of this invention, requiring only 2D-structures of the pertinent molecules, enables virtual screening of synthetic targets and data mining using molecular databases, as well as drug design to optimize pharmacokinetic profiles.

[0029] Figure 1 illustrates the method of this invention. This invention will be described in more detail with reference to Figure 1. Computational modeling studies can be carried out using molecular modeling software, preferably SYBYL™ on a Silicon Graphics Octane™ workstation. The method of this invention comprises the following steps:

(a) The 2D-structure of a molecule can be prepared by retrieving it from a database such as ISIS™/Base, or by constructing it manually with drawing software. The drawing software includes, for example, the SYBYL™/SKETCH option (on the workstation), or ISIS™/Draw, ChemDraw™ and SMILES™ on a PC (such as a Windows NT client PC). The 2D-structure thus prepared can be transferred to the workstation and stored in the molecular database.

(b) The prepared 2D-structure of a molecule can be imported into molecular modeling software such as SYBYL™ in MOL2 format. 2D-fingerprints can be constructed by use of the internally developed macro script 2dfp.spl or 2dfp_abs.spl, written in SYBYL™ Programming Language (SPL) implemented in SYBYL™, or by manually counting the number of the atoms/fragments or substructures. The macro program converts 2D-structures stored in the molecular database in MOL2 format into SYBYL™ line notation (SLN) format. Subsequently, the macro searches each SLN for the substructures potentially related to a pharmacokinetic property that match the queries described in the macro (as shown in Working Example 4), wherein the queries are predefined as the substructures (20 to 80 atoms/fragments). Finally, the macro enumerates the substructure counts and records them as 2D-fingerprints.

(c) Statistical analysis is performed to obtain a correlation between the obtained 2D-fingerprints and the pharmacokinetic property. Any analytical method, such as the partial least squares (PLS) algorithm, sample-distance partial least squares (SAMPLS; Bush, B. L. et al. J. Computer-Aided Mol. Design, 1993, 7, 587-619), a genetic algorithm or a neural network, can be employed to yield an optimal quantitative structure-property relationship (QSPR) model.

(d) The pharmacokinetic property for trial molecules can be calculated based on the QSPR model obtained above.
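
Steps (c) and (d) can be sketched in Python as follows. This is only an illustration under stated assumptions: the routine is a plain NIPALS-style PLS1 written from scratch (it is not the SYBYL™ QSPR module or SAMPLS), and the fingerprints, property values and function names (`fit_pls1`, `predict_pls1`) are invented for the example.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model by NIPALS-style deflation."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        # Weight vector: covariance of each descriptor with the residual.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(m)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:      # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        p = [dot([E[i][j] for i in range(n)], t) / tt for j in range(m)]
        q = dot(f, t) / tt
        # Deflate descriptors and response before the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def predict_pls1(model, x):
    """Predict the property of one trial molecule from its fingerprint."""
    x_mean, y_mean, components = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [ei - t * pi for ei, pi in zip(e, p)]
    return y_hat


# Hypothetical training set: two descriptor counts per molecule and an
# invented property that happens to equal 1 + 2*count_1 - count_2.
train_fp = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
train_y = [1.0, 3.0, 0.0, 2.0, 4.0]

model = fit_pls1(train_fp, train_y, n_components=2)
print(predict_pls1(model, [3, 2]))
```

With as many components as independent descriptors, PLS1 reproduces an ordinary least-squares fit; in practice fewer components are kept, and their number is chosen by crossvalidation as in the Working Examples.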

[0030] The pharmacokinetic properties of the molecule, such as absorption, distribution, metabolism and excretion, can be expressed as the apparent permeability coefficient (Papp) [cm/sec], the blood-brain barrier partitioning ratio (C_brain/C_blood = BB), the half-life (t1/2) in mammalian liver microsomes, and the like.
[0031] The system of this invention can be constructed using appropriate computer hardware such as a Silicon Graphics Octane™ workstation and software as described above.

[0032] This invention will be further described below with reference to the following Working Examples. 
Examples 


Example 1 

Development and validation of QSPR for half life in human liver microsome. 

[0033] Computational modeling studies were carried out using a Silicon Graphics Octane™ workstation. A congeneric series of 54 compounds of Formula (I) (as shown in the following Table 1) with a variety of substituent groups was used as a training set for analysis.

Table 1.

#     A                  R1                          R2
1a    cycloheptyl        piperidinyl                 Ph
2a    cycloheptyl        H2N(CH2)2O-                 Ph
3a    cycloheptyl        4-aminopiperidyl            Ph
4a    cycloheptyl        HaNtCHa^O)-                 Ph
5a    cycloheptyl        H2N(CH2)2CONH-              Ph
6a    cyclohepten-1-yl   4-aminopiperidyl            Ph
7a    cyclooctyl         H2NCH2CONH-                 Ph
8a    cycloheptyl        H^CH^-                      Ph
9a    cycloheptyl        4-aminocyclohexylamino      Ph
10a   cyclohepten-1-yl   piperazinyl                 Ph
11a   cycloheptyl        piperazinyl                 Ph
12a   cycloheptyl        H2N(CH2)2NH-                Ph
13a   cycloheptyl        H2NC(CH3)2CH2NH-            Ph
14a   cycloheptyl        N-methylpiperazinyl         Ph
15a   cycloheptyl        piperidinylamino            Ph
16a   cycloheptyl        4-aminopiperidyl            CH3
17a   cycloheptyl        piperidinyl                 CH3
18a   cycloheptyl        HaNfCHj^oNH-                Ph
19a   cycloheptyl        4-aminoazetidinyl           Ph
20a   cycloheptyl        HaNtCH^eNH-                 Ph
21a   cycloheptyl        (CHakNtCH^NH-               Ph
22a   cyclooctyl         N-methylpiperazinyl         Ph
23a   cycloheptyl        piperazinyl                 isopropyl
24a   cycloheptyl        piperidinecarboximidamide   Ph
25a   cycloheptyl        H2N(CH2)6NH-                Ph
26a   cycloheptyl        H2N(CH2)4NH-                Ph
27a   cyclononyl         amino                       Ph
28a   cycloheptyl        CHaNHtCH^H-                 Ph
29a   cyclooctyl         piperazinyl                 CH3
30a   cycloheptyl        4-aminopiperidyl            vinyl
31a   cycloheptyl        isopropyl                   Ph
32a   cycloheptyl        2-guanidinoethyl            Ph
33a   cycloheptyl        methanesulfonyl             Ph
34a   cycloheptyl        piperidinyloxy              Ph
35a   cycloheptyl        dimethylamino               Ph
36a   cycloheptyl        amino                       Ph
37a   cycloheptyl        CH3CONH-                    Ph
38a   cycloheptyl        hydroxypiperidinyl          Ph
39a   cycloheptyl        H2N(CH2)3SO2-               Ph
40a   cycloheptyl        methylamino                 Ph
41a   cycloheptyl        methyl                      Ph
42a   cyclooctyl         piperazinyl                 CH3
43a   cycloheptyl        isobutyl(NH2)CHCONH-        Ph
44a   cycloheptyl        methylamino                 CH3
45a   cycloheptyl        methoxy                     Ph
46a   cyclooctyl         methylamino                 n-propyl
47a   cyclooctyl         methylamino                 CH3
48a   cyclooctyl         methylpiperazinyl           CH3
49a   cycloheptyl        H                           Ph
50a   cyclononyl         methylamino                 CH3
51a   cyclononyl         methylpiperazinyl           CH3
52a   cycloheptyl        isobutyl(NH2)CHCONH-        CH3
53a   cycloheptyl        H2N(CH2)2CONH-              Ph
54a   cycloheptyl        HaNtCHgJgCCONH-             Ph



[0034] The half-life (t1/2) in vitro for each compound was determined by HPLC analysis of the reaction mixture with human liver microsomes. The 2D-structures employed were retrieved from ISIS™/Base (version 2.2.1, MDL Information Systems, Inc., San Leandro, CA) or constructed with ISIS™/Draw (version 2.2.1, MDL Information Systems, Inc., San Leandro, CA) on a WinNT client PC, then transferred to the Octane workstation and stored in a molecular database. The 2D-fingerprints were constructed by use of a newly developed macro script, 2dfp.spl, written in SYBYL™ Programming Language (SPL), which was implemented in SYBYL™ (version 6.5, Tripos Inc., St. Louis, MO). The macro program converted 2D-structures stored in the molecular database in MOL or MOL2 format into SYBYL™ line notation (SLN) format, and counted the number of the atoms or functional groups that matched queries defined in a table described in the macro program. The atoms or functional groups likely to be involved in metabolism were assigned on the basis of the literature sources (Bonse, V. G., Metzler, M. "Biotransformationen Organischer Fremdsubstanzen" (Yakubutu-Taisya, in Japanese) Asakura, Tokyo (1980); Kato, R.; Kamataki, T. "Yakubutu-Taisyagaku" in Japanese, Tokyo-Kagaku-Dojin, Tokyo (1995)). The partial least squares (PLS) algorithm in the QSPR module of SYBYL™ was employed to correlate the aforementioned 2D-fingerprints with t1/2 to produce a QSPR model. Thirty-eight bits of the whole 2D-fingerprint were used, since 25 bits that had the same value for all compounds, or were 0, were dropped. A SAMPLS run in the crossvalidation step (leave-one-out) identified the optimum number of PLS components as 5 (N = 54, Std. Error_prediction = 0.414; q² = 0.518). Non-crossvalidated PLS analysis resulted in a significant five-component model with the following statistics: Std. Error_Est. = 0.219, r² = 0.865, F(n1 = 5, n2 = 48) = 61.3.
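
The reported F value can be cross-checked against r² with a short Python sketch. It assumes that F is the conventional regression F ratio with n1 equal to the number of PLS components and n2 = N - n1 - 1; the text does not state which formula SYBYL™ uses, and the function name `f_statistic` is hypothetical.

```python
def f_statistic(r2, n_components, n_samples):
    """Conventional regression F ratio computed from r^2 (assumed formula)."""
    n1 = n_components
    n2 = n_samples - n_components - 1
    return (r2 / n1) / ((1.0 - r2) / n2)


# Values taken from the statistics quoted above: r^2 = 0.865, 5 components, N = 54.
f_value = f_statistic(0.865, 5, 54)
print(f"F(5, 48) = {f_value:.1f}")  # about 61.5
```

Under this assumption n2 = 54 - 5 - 1 = 48, matching the reported degrees of freedom, and the computed value of about 61.5 is close to the reported 61.3; the small gap is what one would expect from r² being quoted to three decimals.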

[0035] Figure 2 shows the plot of actual vs. calculated log t1/2 (closed circles). For validation of the present QSPR model, the half-life was predicted for a test set (12 compounds). As indicated by the open squares in Figure 2, the model has fairly good predictivity, which allows the targets for synthesis to be prioritized.

Example 2 

Development of QSPR for Caco-2 permeability. 

[0036] Unless otherwise noted, computational molecular modeling was performed as described in Example 1. Table 2 lists 21 structurally diverse compounds used as a training set, whose apparent permeability coefficients (Papp) [cm/sec] across Caco-2 cells were taken from the literature (Yee, S. Pharm. Res. 1997, 14, 763-766). The counts of substructures matching the predefined queries were encoded as an array of integers by a similar SPL script (2dfp_abs.spl) to afford 2D-fingerprints as descriptors employed in the correlation analysis. A SAMPLS run in the crossvalidation step (leave-one-out) identified the optimum number of PLS components as 2 (N = 21, Std. Error_prediction = 0.444; q² = 0.463). Non-crossvalidated PLS analysis resulted in a significant two-component model with the following statistics: Std. Error_Est. = 0.254, r² = 0.824, F(n1 = 2, n2 = 18) = 42.1. Figure 3 shows the plot of actual vs. calculated log(Papp × 10^6).

Table 2. Training set compounds with apparent permeability.

Compd.              Papp×10^6    Compd.          Papp×10^6    Compd.          Papp×10^6
                    (cm/sec)                     (cm/sec)                     (cm/sec)
Azithromycin         1.04        Diazepam        70.97        Prazosin        43.60
Benzylpenicillins    1.96        Erythromycin     1.80        Propranolol     27.50
Caffeine            50.50        Fluconazole     29.80        Quinidine       20.40
Chloramphenicol     20.60        Ibuprofen       52.50        Tenidap         51.20
Clonidine           30.10        Imipramine      14.10        Testosterone    72.27
Desipramine         21.60        Methotrexate     1.20        Trovafloxacin   30.23
Dexamethasone       23.40        Naloxone        28.20        Ziprasidone     12.30
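
The quantity plotted in Figure 3 is log(Papp × 10^6), so the tabulated values are log-transformed before modeling. A minimal Python sketch of that transform for three of the Table 2 entries follows; the dictionary names are illustrative, and a base-10 logarithm is assumed.

```python
import math

# Papp x 10^6 (cm/sec) for three compounds, copied from Table 2.
papp_e6 = {"Azithromycin": 1.04, "Diazepam": 70.97, "Methotrexate": 1.20}

# Response variable used for the correlation: log10(Papp x 10^6).
log_papp = {name: math.log10(value) for name, value in papp_e6.items()}

for name, value in log_papp.items():
    print(f"{name}: {value:.3f}")
```

The transform compresses the roughly 70-fold spread of permeabilities into a range of about two log units, which is the scale on which the standard errors of Example 2 are expressed.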



Example 3 

Development of QSPR for blood-brain barrier partition. 

[0037] Unless otherwise noted, molecular modeling was performed as described in Example 1. Blood-brain barrier partitioning ratios, log(C_brain/C_blood) = logBB, for "drug-like" compounds (N = 35, Chart 1) used as a training set were taken from the literature (Lombardo, F. et al. J. Med. Chem. 1996, 39, 4750-4755). The 2D-fingerprints were calculated as in the above examples. PLS modeling to correlate the 2D-fingerprints with the BBB partitioning ratio gave the following statistics. Crossvalidation (SAMPLS, leave-one-out): optimum number of PLS components = 3, N = 35, Std. Error_prediction = 0.69; q² = 0.59. Non-crossvalidation: Std. Error_Est. = 0.38, r² = 0.78, F(3, 31) = 37.4.
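
For reference, the crossvalidated q² and the fitted r² quoted in Examples 1 to 3 are conventionally defined as below. These are the standard textbook definitions, given here as an assumption, since the text does not spell out the formulas used by SYBYL™; the symbol \hat{y}_{(i)} denotes the prediction for compound i from a model fitted without that compound (leave-one-out), and \hat{y}_i the prediction from the model fitted to all N compounds.

```latex
q^2 = 1 - \frac{\sum_{i=1}^{N} \bigl(y_i - \hat{y}_{(i)}\bigr)^2}{\sum_{i=1}^{N} \bigl(y_i - \bar{y}\bigr)^2},
\qquad
r^2 = 1 - \frac{\sum_{i=1}^{N} \bigl(y_i - \hat{y}_i\bigr)^2}{\sum_{i=1}^{N} \bigl(y_i - \bar{y}\bigr)^2}
```

Because each \hat{y}_{(i)} is made without seeing compound i, q² is normally lower than r², as in all three Working Examples.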



EP1 167 969 A2 

Chart 1. Compounds employed in the analysis (compound 36 for validation). [Structure drawings not reproduced.]

Example 4

SPL macro (2dfp.spl) to prepare 2D-fingerprints for half-life.
[0038]

uims define macro 2dfp sybylbasic yes
#
# Set the Source Database, and Column-Names File.
#
setvar source %promptif("$1" "STRING" "MYFILE.MDB" "Source Database.mdb" "Database with molecules to be calculated")
setvar resultsFP %promptif("$1" "STRING" "Columns.txt" "Filename storing column names" "Text file to store column names")
## if %not(%mols(*))
## %dialog_message(ERROR "There are no molecules." "No Molecules") >$NULLDEV
## return
## endif
#
# Set the molecule area to calculate "2D-FingerPrint".
# Note that $current_molarea is defined by the "calling"
# table when adding a column of data.
#
localvar mol_area
if $1
setvar mol_area $1
else
setvar mol_area $current_molarea
endif

database open $source read
##
## Loop over all molecules in the source database
##
for j IN %database(*)
database get "$j" $mol_area
#
# set the SLN expression for the molecular area
#
setvar sln_exp %sln($mol_area)
setvar ARRAY
#
## # items + 1 (compd_num) BITs will be used
#
################ @compd_num) ################## BIT 1
setvar ARRAY[01] %mol_info($mol_area name)
###### Exp-Generatorjread(filelD) looping is another choice
################ Unsaturated bonds ########## BIT 2-4
## @fp1a) Unsaturated bonds (aromatic)
setvar query Any:Any
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[02] $BIT
## @fp1b) Unsaturated bonds (double)
setvar query Any=Any
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[03] $BIT
## @fp1c) Unsaturated bonds (triple)
setvar query Any#Any
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[04] $BIT

################ ring (topology) ################ BIT 5-15
## @fp2a) 3-membered ring
setvar query Hev[1]~Hev~Hev@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[05] $BIT
## @fp2b) 4-membered ring
setvar query Hev[1]~Hev~Hev~Hev@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[06] $BIT
## @fp2c) 5-membered ring
setvar query Hev[1]~Hev~Hev~Hev~Hev@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[07] $BIT
## @fp2d) 6-membered ring
setvar query Hev[1]~Hev~Hev~Hev~Hev~Hev@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[08] $BIT
## @fp2e) phenyl ring
setvar query C[1]:C:C:C:C:C:@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[09] $BIT
## @fp2f) 7-membered ring
setvar query Hev[1]~Hev~Hev~Hev~Hev~Hev~Hev@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[10] $BIT
## @fp2g) 8-membered ring
setvar query Hev[1]~Hev~Hev~Hev~Hev~Hev~Hev~Hev@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[11] $BIT
## @fp2h) 9-membered ring
setvar query Hev[1]~Hev~Hev~Hev~Hev~Hev~Hev~Hev~Hev@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[12] $BIT
## @fp2i) 10-membered ring
setvar query Hev[1]~Hev~Hev~Hev~Hev~Hev~Hev~Hev~Hev~Hev@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[13] $BIT
## @fp2j) 11-membered ring
setvar query Hev[1]~Hev~Hev~Hev~Hev~Hev~Hev~Hev~Hev~Hev~Hev@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[14] $BIT
## @fp2k) 12-membered ring
setvar query Hev[1]~Hev~Hev~Hev~Hev~Hev~Hev~Hev~Hev~Hev~Hev~Hev@1
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[15] $BIT

################ Elements_Overall ################ BIT 16-22
## @fp3a) total Hetero atoms
setvar query Het
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[16] $BIT
## @fp3b) total Halogen
setvar query Any[is=F,Br,Cl,I]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[17] $BIT
## @fp3c) total N
setvar query N
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[18] $BIT
## @fp3d) total NH
setvar query NH
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[19] $BIT
## @fp3e) total O
setvar query O
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[20] $BIT
## @fp3f) total OH
setvar query OH
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[21] $BIT
## @fp3g) total S
setvar query S
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[22] $BIT

################ Methyl, terminal ############# BIT 23-26
## @fp4a) C-Methyl (omega-Oxidation)
setvar query C-CH3
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[23] $BIT
## @fp4b) N-Methyl (N-demethylation)
setvar query N-CH3
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[24] $BIT
## @fp4c) O-Methyl (O-demethylation)
setvar query O-CH3
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[25] $BIT
## @fp4d) S-Methyl (S-demethylation)
setvar query CH3-S[F]-Any[NOT=H*]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[26] $BIT

################ Methylene -CH2- ################ BIT 27-30
## @fp5a) Methylene group
setvar query Any[NOT=H*,N,O]-CH2-Any[NOT=H*,N,O]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[27] $BIT
## @fp5b) N-Methylene
setvar query N-CH2-Any[NOT=H*]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[28] $BIT
## @fp5c) O-Methylene
setvar query O-CH2-Any[NOT=H*]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[29] $BIT
## @fp5d) S-Methylene
setvar query S[F]-CH2-Any[NOT=H*]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[30] $BIT

################ Methine >CH-, Allylic/Benzylic H (to be absorbed) ################ BIT 31-36
## @fp6a) Methine group
setvar query Any[NOT=H*,N,O,S]-CH(-Any[NOT=H*,N,O,S])-Any[NOT=H*,N,O,S]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[31] $BIT
## @fp6b) Benzylic H (Ar-CH) (if Ph-CH2, then the count =2)
setvar query CHC(:Any):Any
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[32] $BIT
## @fp6c) Allylic H (if CR=CR-CH2, then the count =2)
setvar query CHC(=C)
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[33] $BIT
## @fp6d) N-Methine
setvar query N-CH(-Any[NOT=H*])-Any[NOT=H*]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[34] $BIT
## @fp6e) O-Methine
setvar query O-CH(-Any[NOT=H*])-Any[NOT=H*]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[35] $BIT
## @fp6f) S-Methine
setvar query S-CH(-Any[NOT=H*])-Any[NOT=H*]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[36] $BIT

################ Nitrogen containing Compounds ################ BIT 37-49
################ Amines / Imines / Nitrile ############# BIT 37-46
## @fp7a) Primary Amines, unbranched
setvar query NH2CH2
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[37] $BIT
## @fp7b) Primary Amines, branched
setvar query NH2CH(Any[NOT=H*])(Any[NOT=H*])
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[38] $BIT
## @fp7c) Primary Amines, branched
setvar query NH2C(Any[NOT=H*])(Any[NOT=H*])(Any[NOT=H*])
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[39] $BIT
## @fp7d) Primary Anilines (Ar-NH2)
setvar query NH2C:Any(:Any[NOT=H*])
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[40] $BIT
## @fp7e) Secondary Amines
setvar query NH(C[NOT=C=O])C[NOT=C=O]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[41] $BIT
## @fp7f) Tertiary Amines
setvar query N(C[NOT=C=O])(C[NOT=C=O])(C[NOT=C=O])
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[42] $BIT
## @fp7g) Imines
setvar query N=C
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[43] $BIT
## @fp7h) Nitrile
setvar query C#N
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[44] $BIT
## @fp7i) N in aromatics
setvar query Any[is=N,C]:N:Any[is=N,C]
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[45] $BIT
## @fp7j) Guanidine
setvar query NC(=N)N
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[46] $BIT



################ N-O, Nitro, N-N ################ BIT 47-49
##@fp7k) N-O (Hydroxyamine, Oxime, Hydroxamic acid, ...)
setvar query N-O
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[47] $BIT
##@fp7l) Nitro (count =2), Nitroso (count =1)
setvar query N(=O)
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[48] $BIT
##@fp7m) N-N
setvar query N-N
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[49] $BIT

################ Amide, Ester, Sulfonamide ################ BIT 50-52
##@fp8a) Ester
setvar query C(=O)OC
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[50] $BIT
##@fp8b) Amide
setvar query NC(=O)
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[51] $BIT
##@fp8c) Sulfonamide
setvar query NS(=O)(=O)
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[52] $BIT
################ Ketone, Aldehyde, Alcohol, Thiol, Sulfide ################ BIT 53-59
##@fp9a) Primary Alcohol
setvar query CH2(OH)(~Hev)
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[53] $BIT
##@fp9b) Secondary Alcohol
setvar query CH(OH)(~Hev)(~Hev)
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[54] $BIT
##@fp9c) Ketone, Aldehyde
setvar query Any[is=H,C]CC(=O)(Any[is=H,C])
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[55] $BIT
##@fp9d) COOH
setvar query Any[is=H,C]CC(=O)(OH)
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[56] $BIT
##@fp9e) Sulfide
setvar query CS[F]C
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[57] $BIT
##@fp9f) Thiol
setvar query S[F]H(C)
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[58] $BIT
##@fp9g) Thiocarbonyl
setvar query C=S
setvar BIT %count(%search2D($sln_exp $query NoDuplicate 0 yes))
setvar ARRAY[59] $BIT

echo $ARRAY
echo $ARRAY >> $resultsFP

zap $mol_area

endfor

database close

## Announce the completion & the location of the results
echo
echo "job completed on %system(date)"
echo "The results are stored in $resultsFP as a text file,"
echo "please import it from MSS (table)."
echo " ==> custom format, space separated, column label used"
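
The counting loop of the macro above can be sketched outside SYBYL as follows. This is a minimal illustration in Python, not the macro itself: `build_fingerprint`, `format_row` and `toy_count` are hypothetical names, and `toy_count` is a naive substring counter that merely stands in for a real 2D substructure search such as %search2D. The sketch only shows how one count per query is collected in a fixed order and written out as a space-separated row.

```python
from typing import Callable, List, Sequence


def build_fingerprint(
    molecule: str,
    queries: Sequence[str],
    count_matches: Callable[[str, str], int],
) -> List[int]:
    """Return one match count per query, in the order the queries are given."""
    return [count_matches(molecule, query) for query in queries]


def format_row(fingerprint: Sequence[int]) -> str:
    """Render a fingerprint as one space-separated text row."""
    return " ".join(str(count) for count in fingerprint)


def toy_count(molecule: str, query: str) -> int:
    """ASSUMPTION: naive substring count on a line-notation string.

    A real implementation would call a substructure search engine here.
    """
    return molecule.count(query)


if __name__ == "__main__":
    # Hypothetical line-notation string and three queries taken from the listing.
    queries = ["C(=O)OC", "C#N", "N=C"]
    fingerprint = build_fingerprint("CC(=O)OCC#N", queries, toy_count)
    print(format_row(fingerprint))
```

Keeping the matcher as a parameter leaves the loop independent of whichever search engine actually performs the substructure match.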

Example 5 



[0039] By use of a method similar to that described in Example 4, 2D-fingerprints for Caco-2 permeability and blood-brain
barrier partition were prepared based on the following description.



Fp-ID | Name | Query

Alkyl Amines
fp1a | Primary | NH2C[NOT=C=O,C=S,C:Any,Any[IS=C,N]C(=N)N,C[1](Any[IS=O,S,N]Any=:AnyAny=@1),C[1](=AnyAny[IS=O,S,N]Any=:Any@1)]
fp1b | Secondary | NH(C[NOT=C=O,C=S,C:Any,Any[IS=C,N]C(=N)N,C[1](Any[IS=O,S,N]Any=:AnyAny=@1),C[1](=AnyAny[IS=O,S,N]Any=:Any@1)])(C[NOT=C=O,C=S,C:Any,Any[IS=C,N]C(=N)N,C[1](Any[IS=O,S,N]Any=:AnyAny=@1),C[1](=AnyAny[IS=O,S,N]Any=:Any@1)])
fp1c | Tertiary | N(C[NOT=C=O,C=S,C:Any,Any[IS=C,N]C(=N)N,C[1](Any[IS=O,S,N]Any=:AnyAny=@1),C[1](=AnyAny[IS=O,S,N]Any=:Any@1)])(C[NOT=C=O,C=S,C:Any,Any[IS=C,N]C(=N)N,C[1](Any[IS=O,S,N]Any=:AnyAny=@1),C[1](=AnyAny[IS=O,S,N]Any=:Any@1)])(C[NOT=C=O,C=S,C:Any,Any[IS=C,N]C(=N)N,C[1](Any[IS=O,S,N]Any=:AnyAny=@1),C[1](=AnyAny[IS=O,S,N]Any=:Any@1)])

Amines attached to heteroaromatics
fp2a | Primary | C[1](Any[IS=O,S,N]Any=:AnyAny=@1)NH2
     |         | C[1](=AnyAny[IS=O,S,N]Any=:Any@1)NH2
fp2b | Secondary | C[1](Any[IS=O,S,N]Any=:AnyAny=@1)NHAny[NOT=H*]
     |           | C[1](=AnyAny[IS=O,S,N]Any=:Any@1)NHAny[NOT=H*]
fp2c | Tertiary | C[1](Any[IS=O,S,N]Any=:AnyAny=@1)N(Any[NOT=H*])Any[NOT=H*]
     |          | C[1](=AnyAny[IS=O,S,N]Any=:Any@1)N(Any[NOT=H*])Any[NOT=H*]

Anilines
fp3a | Primary | NH2C(:Any)(:Any[NOT=H*])
fp3b | Secondary | NH(C(:Any)(:Any[NOT=H*]))Any[NOT=H*]
fp3c | Tertiary | N(C(:Any)(:Any[NOT=H*]))(Any[NOT=H*])Any[NOT=H*]
     |          | N(C(:Any)(:Any[NOT=H*]))=C

N in aromatics
fp4a | 6-membered ring | Any[IS=N,C]:N:Any[IS=N,C]
fp4b | -NH- in heteroaromatics | N[1]HAny[IS=C,N]=:Any[IS=C,N]Any[IS=C,N]=:Any[IS=C,N]-@1
fp4c | -N- in heteroaromatics | N[1]Any[IS=C,N]=:Any[IS=C,N]Any[IS=C,N]=:Any[IS=C,N]-@1
fp4d | -N= in heteroaromatics | N[1](Any[IS=O,S,N]Any=:AnyAny=@1)
     |                        | N[1](=AnyAny[IS=O,S,N]Any=:Any@1)

Imines/Nitrile/Guanidine/Amidine
fp5a | Imines | Any[IS=C,H,S]N[NOT=N[1](Any[IS=O,S,N]Any=:AnyAny=@1),N[1](=AnyAny[IS=O,S,N]Any=:Any@1)]=C[NOT=Any[IS=C,N]C(=N)N]
fp5b | Nitrile | C#N
fp5c | Guanidine | N[NOT=C[1](Any[IS=O,S,N]Any=:AnyAny=@1)N,C[1](=AnyAny[IS=O,S,N]Any=:Any@1)N]C(=N)N[NOT=C[1](Any[IS=O,S,N]Any=:AnyAny=@1)N,C[1](=AnyAny[IS=O,S,N]Any=:Any@1)N]
fp5d | Amidine (not hetero-aromatics) | Any[NOT=N]C(=N[NOT=N[1](Any[IS=O,S,N]Any=:AnyAny=@1),N[1](=AnyAny[IS=O,S,N]Any=:Any@1)])N

N-O/Nitro/N=N/N-N
fp6a | N-O (Hydroxyamine, Oxime, Hydroxamic acid, ...) | N[!r]-O[!r]
fp6b | Nitro, Nitroso | N(=O)
fp6c | N=N Azo (not in a ring) | N=N[NOT=N[1](Any[IS=O,S,N]Any=:AnyAny=@1),N[1](=AnyAny[IS=O,S,N]Any=:Any@1)]
fp6d | N-N Hydrazine | N-N[NOT=N[1](Any[IS=O,S,N]Any=:AnyAny=@1),N[1](=AnyAny[IS=O,S,N]Any=:Any@1)]

Amide/Thioamide/Sulfonamide
fp7a | Amide1 (NH2-CO) | NH2C=O
fp7b | Amide2 (R1NH-CO) | Any[NOT=H*]NHC=O
fp7c | Amide3 (R1R2N-CO) | Any[NOT=H*]N(C=O)Any[NOT=H*]
fp7d | Thioamide1 (NH2-CS) | NH2C=S
fp7e | Thioamide2 (R1NH-CS) | Any[NOT=H*]NHC=S
fp7f | Thioamide3 (R1R2N-CS) | Any[NOT=H*]N(C=S)Any[NOT=H*]
fp7g | Sulf.amide1 (NH2-SO2) | NH2S(=O)(=O)
fp7h | Sulf.amide2 (R1-NHSO2) | NH(S(=O)=O)Any[NOT=H*]
fp7i | Sulf.amide3 (R1R2-NSO2) | N(S(=O)=O)(Any[NOT=H*])Any[NOT=H*]

Alcohol/Ether/Aldehyde/Ketone/Ester/Carboxylic acid/Carbothioic acid/Sulfinic acid/Sulfonic acid
fp8a | Alcohol | C[NOT=C=O,C=S](OH)
fp8b | Ether | Any[NOT=C=O,H*]-O-Any[NOT=C=O,H*]
fp8c | Aldehyde | CCH(=O)
fp8d | Ketone | CC(=O)C
fp8e | Ester | C(=O)OC
fp8f | Carboxylic acid | C(=O)(OH)
fp8g | Carbothioic O acid | C(=S)(OH)
fp8h | Carbothioic S acid | C(=O)(SH)
fp8i | Sulfinic acid | Any[IS=H,C]S[NOT=S(=O)(=O)](=O)(OH)
fp8j | Sulfonic acid | Any[IS=H,C]S(=O)(=O)(OH)

Halogen
fp9a | Fluoro | F
fp9b | Chloro | Cl
fp9c | Bromo | Br
fp9d | Iodo | I

Total C/H/N/O/S
fp10a | total C | C
fp10b | total H | H
fp10c | total N | N
fp10d | total O | O
fp10e | total S | S
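
Several queries in the table for Example 5 repeat the same two heteroaromatic ring patterns and the same exclusion list. As a reading aid, the sketch below rebuilds the fp1a to fp1c query strings from those shared pieces. It is plain string composition in Python, under the assumption that the queries read as reconstructed here; the names `RING_A`, `RING_B`, `EXCLUDED` and `alkyl_amine_query` are hypothetical and do not appear in the original macro.

```python
# Two ring patterns that recur throughout the table (carbon attached to a
# ring containing O, S or N). ASSUMPTION: spelled as reconstructed from the text.
RING_A = "C[1](Any[IS=O,S,N]Any=:AnyAny=@1)"
RING_B = "C[1](=AnyAny[IS=O,S,N]Any=:Any@1)"

# Carbon environments that disqualify a carbon as an "alkyl" substituent:
# carbonyl, thiocarbonyl, aromatic carbon, guanidine/amidine carbon,
# and the two heteroaromatic ring carbons above.
EXCLUDED = ",".join(
    ["C=O", "C=S", "C:Any", "Any[IS=C,N]C(=N)N", RING_A, RING_B]
)


def alkyl_amine_query(n_carbons: int) -> str:
    """Compose the alkyl amine query for 1, 2 or 3 carbon substituents."""
    if n_carbons not in (1, 2, 3):
        raise ValueError("an amine nitrogen carries 1 to 3 carbon substituents")
    carbon = f"C[NOT={EXCLUDED}]"
    if n_carbons == 1:
        return "NH2" + carbon                 # primary (fp1a)
    if n_carbons == 2:
        return "NH" + f"({carbon})" * 2       # secondary (fp1b)
    return "N" + f"({carbon})" * 3            # tertiary (fp1c)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(alkyl_amine_query(n))
```

Composing the strings this way makes it easier to check that every amine class applies the identical exclusion list.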



Claims 

1. A method for predicting pharmacokinetic properties of molecules comprising the steps of:

(a) preparing 2D-structures of molecules used as a training set;

(b) constructing a 2D-fingerprint by counting the number of structural descriptors that potentially relate to a
pharmacokinetic property, either manually or automatically using internally developed macro; wherein said
structural descriptors consist of predefined 20 to 80 atoms/fragments or substructures;

(c) analyzing the obtained 2D-fingerprint by a statistical analysis method to correlate with the pharmacokinetic
property of the molecule to yield a quantitative structure-property relationship (QSPR) model; and

(d) calculating the pharmacokinetic property of a trial molecule using the above obtained QSPR model.

2. A method of Claim 1, wherein the pharmacokinetic property is absorption.

3. A method of Claim 1, wherein the pharmacokinetic property is distribution.

4. A method of Claim 1, wherein the pharmacokinetic property is metabolism.

5. A method of Claim 1, wherein the pharmacokinetic property is excretion.

6. A method of Claim 1, wherein the internally developed macro comprises the macro script 2dfp.spl or 2dfp_abs.spl,
written in SYBYL™ Programming Language (SPL).

7. A system for predicting pharmacokinetic properties of molecules comprising: 

(a) means for preparing 2D-structures of molecules used as a training set; 

(b) means for constructing a 2D-fingerprint by counting the number of structural descriptors that potentially 
relate to a pharmacokinetic property, wherein said structural descriptors consist of predefined 20 to 80 atoms/ 
fragments or substructures; 

(c) means for analyzing the obtained 2D-fingerprint by a statistical analysis method to correlate with the
pharmacokinetic property of the molecule to yield a quantitative structure-property relationship (QSPR) model; and

(d) means for calculating the pharmacokinetic property of a trial molecule using the above obtained QSPR
model.

Fig. 1. Flow chart of the method. Steps: (a) preparation of 2D structure of molecule; (b) generation and storage
of 2D-fingerprint; (c) correlation analysis to yield QSPR model; (d) prediction of PK property. Inputs shown:
predefined table of substructures; internally developed SPL code; PK related data (e.g. T1/2, %HIA, Caco-2
permeability, B/P ratio); trial molecules.
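
Steps (c) and (d) of the flow can be illustrated with a small numerical sketch. The text only calls for "a statistical analysis method"; ordinary least squares is used below purely as a stand-in, and the function names `solve_linear`, `fit_qspr` and `predict`, as well as the toy counts and property values, are invented for illustration and are not data from this document.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; descriptors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_qspr(
    fingerprints: Sequence[Sequence[int]], values: Sequence[float]
) -> List[float]:
    """Fit intercept + one coefficient per descriptor via the normal equations."""
    rows = [[1.0] + [float(c) for c in fp] for fp in fingerprints]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(k)]
    return solve_linear(xtx, xty)


def predict(coeffs: Sequence[float], fingerprint: Sequence[int]) -> float:
    """Apply the fitted model to the fingerprint of a trial molecule."""
    return coeffs[0] + sum(c * n for c, n in zip(coeffs[1:], fingerprint))


if __name__ == "__main__":
    # Toy training set: two descriptor counts per molecule, invented values.
    train_fp = [[0, 1], [1, 0], [2, 1], [1, 3], [3, 2]]
    train_y = [1 + 2 * a - 0.5 * b for a, b in train_fp]
    model = fit_qspr(train_fp, train_y)
    print(predict(model, [2, 2]))
```

With 20 to 80 descriptors and a limited training set, the normal equations can become ill-conditioned, which is why a regularized or latent-variable method may be preferable in practice to this plain least-squares stand-in.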

Fig. 2. Calculated vs. actual log t1/2 (horizontal axis: actual log t1/2).

Calculated values for training set (N = 54) are indicated as closed circles and
predicted values for test set as open squares.

Fig. 3. Calculated vs. actual log(Papp * 10^6) (horizontal axis: actual log(Papp * 10^6)).

Fig. 4. A plot of actual vs. calculated logBB (horizontal axis: actual logBB).